12 research outputs found

    Execution strategies for SQL subqueries

    Full text available
    Optimizing SQL subqueries has been an active area in database research and the database industry throughout the last decades. Previous work has already identified some approaches to efficiently execute relational subqueries. For satisfactory performance, proper choice of subquery execution strategies becomes even more essential today with the increase in decision support systems and automatically generated SQL, e.g., with ad-hoc reporting tools. This goes hand in hand with increasing query complexity and growing data volumes, all of which pose challenges for an industrial-strength query optimizer. This paper explores the basic building blocks that Microsoft SQL Server utilizes to optimize and execute relational subqueries. We start with indispensable prerequisites such as detection and removal of correlations for subqueries. We identify a full spectrum of fundamental subquery execution strategies such as forward and reverse lookup as well as set-based approaches, explain the different execution strategies for subqueries implemented in SQL Server, and relate them to the current state of the art. To the best of our knowledge, several strategies discussed in this paper have not been published before. An experimental evaluation complements the paper. It quantifies the performance characteristics of the different approaches and shows that alternative execution strategies are indeed needed in different circumstances, which makes a cost-based query optimizer indispensable for adequate query performance.
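    The strategy names mentioned in the abstract can be made concrete with a small, self-contained sketch. The Python below is purely illustrative: the tables, the IN-style subquery it mimics, and the three functions are hypothetical stand-ins for forward lookup, reverse lookup, and a set-based (hash) approach, not SQL Server's actual operators.

        # Toy data standing in for:
        #   SELECT * FROM orders o
        #   WHERE o.cust_id IN (SELECT c.id FROM customers c WHERE c.active = 1)
        # (hypothetical tables; a real optimizer chooses among such strategies by cost)
        orders = [{"id": 1, "cust_id": 10}, {"id": 2, "cust_id": 20}, {"id": 3, "cust_id": 10}]
        customers = [{"id": 10, "active": True}, {"id": 20, "active": False},
                     {"id": 30, "active": True}]

        def forward_lookup():
            # For each outer row, probe the subquery side (nested-loops flavour).
            return [o for o in orders
                    if any(c["id"] == o["cust_id"] and c["active"] for c in customers)]

        def reverse_lookup():
            # Start from the subquery side: for each qualifying inner row,
            # look up matching outer rows, emitting each outer row at most once.
            seen, result = set(), []
            for c in customers:
                if not c["active"]:
                    continue
                for o in orders:
                    if o["cust_id"] == c["id"] and o["id"] not in seen:
                        seen.add(o["id"])
                        result.append(o)
            return result

        def set_based():
            # Materialize the subquery result once (here: a hash set), then semi-join.
            active_ids = {c["id"] for c in customers if c["active"]}
            return [o for o in orders if o["cust_id"] in active_ids]

        # All three strategies compute the same result; they differ only in cost.
        assert ({o["id"] for o in forward_lookup()}
                == {o["id"] for o in reverse_lookup()}
                == {o["id"] for o in set_based()})

    Which of the three wins depends on the relative input sizes and available indexes, which is exactly why the abstract argues that a cost-based choice among alternative strategies is indispensable.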

    PIVOT and UNPIVOT: Optimization and Execution Strategies in an RDBMS

    No full text
    PIVOT and UNPIVOT, two operators on tabular data that exchange rows and columns, enable data transformations useful in data modeling, data analysis, and data presentation. They can quite easily be implemented inside a query processor, much like select, project, and join. Such a design provides opportunities for better performance, both during query optimization and query execution. We discuss query optimization and execution implications of this integrated design and evaluate the performance of this approach using a prototype implementation in Microsoft SQL Server.
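    As a rough sketch of what the two operators compute, the Python below pivots a narrow (key, attribute, value) table into one row per key with one column per attribute, and unpivots it back. The functions, column names, and data are hypothetical illustrations of the row/column exchange, not the engine-internal operators the paper describes.

        def pivot(rows, key, attr, value):
            # One output row per distinct key; one output column per attribute value.
            out = {}
            for r in rows:
                out.setdefault(r[key], {key: r[key]})[r[attr]] = r[value]
            return list(out.values())

        def unpivot(rows, key, attr, value):
            # Turn every non-key column back into an (attribute, value) pair.
            return [{key: r[key], attr: col, value: val}
                    for r in rows for col, val in r.items() if col != key]

        sales = [
            {"emp": "ann", "month": "jan", "amount": 10},
            {"emp": "ann", "month": "feb", "amount": 20},
            {"emp": "bob", "month": "jan", "amount": 5},
        ]

        wide = pivot(sales, "emp", "month", "amount")
        # -> [{'emp': 'ann', 'jan': 10, 'feb': 20}, {'emp': 'bob', 'jan': 5}]
        back = unpivot(wide, "emp", "month", "amount")

        def canon(rows):
            return sorted(tuple(sorted(r.items())) for r in rows)

        assert canon(back) == canon(sales)   # unpivot(pivot(t)) round-trips these rows

    A missing (key, attribute) combination simply yields no column here, where a SQL PIVOT would produce a NULL; implementing the operators inside the engine, as the paper proposes, lets the optimizer reason about them alongside select, project, and join.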

    The Complexity of Transformation-Based Join Enumeration

    No full text
    Query optimizers that explore a search space exhaustively using transformation rules usually apply all possible rules on each alternative, and stop when no new information is produced. A memoizing structure was proposed in [McK93] to improve the re-use of common subexpressions, thus improving the efficiency of the search considerably. However, a question that remained open is: what is the complexity of the transformation-based enumeration process? In particular, with n the number of relations, does it achieve the O(3^n) lower bound established by [OL90]? In this paper we examine the problem of duplicates in transformation-based enumeration. In general, different sequences of transformation rules may end up deriving the same element, and the optimizer must detect and discard these duplicate elements generated by multiple paths. We show that the usual commutativity/associativity rules for joins generate O(4^n) duplicate opera…
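    The duplicate problem is easy to reproduce with a toy enumerator. The sketch below is a simplification (it enumerates whole join trees rather than memo entries with shared subexpressions, so its counts are not the paper's figures): it applies commutativity and one direction of associativity exhaustively and reports how many rule applications merely re-derive an expression that is already known.

        # Join trees over relations "A", "B", ... represented as nested tuples.
        def rewrites(e):
            """All expressions obtained from e by one rule application at any position."""
            if not isinstance(e, tuple):
                return
            l, r = e
            yield (r, l)                              # commutativity at the root
            if isinstance(l, tuple):
                yield (l[0], (l[1], r))               # associativity: (A join B) join C -> A join (B join C)
            for l2 in rewrites(l):                    # the same rules, applied below the root
                yield (l2, r)
            for r2 in rewrites(r):
                yield (l, r2)

        def closure(start):
            """Memo-style exhaustive exploration: stop when nothing new is produced."""
            known, frontier, applications = {start}, [start], 0
            while frontier:
                nxt = []
                for e in frontier:
                    for d in rewrites(e):
                        applications += 1             # every rule application costs work...
                        if d not in known:            # ...but only some produce new expressions
                            known.add(d)
                            nxt.append(d)
                frontier = nxt
            return known, applications

        for n in range(3, 6):
            start = "A"
            for rel in "BCDEF"[:n - 1]:
                start = (start, rel)                  # left-deep initial join tree
            known, applications = closure(start)
            duplicates = applications - (len(known) - 1)
            print(f"n={n}: {len(known)} distinct expressions, "
                  f"{applications} rule applications, {duplicates} duplicates")

    Even at these tiny sizes the re-derivations outnumber the new expressions and the gap widens with n, which is the kind of overhead the paper quantifies against the O(3^n) lower bound in the memo setting.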

    Duplicate-free Generation of Alternatives in Transformation-based Optimizers

    No full text
    Transformation-based optimizers that explore a search space exhaustively usually apply all possible transformation rules on each alternative, and stop when no new information is produced. In general, different sequences of transformation rules may end up deriving the same element. The optimizer must detect and discard these duplicate elements generated by multiple paths. In this paper we consider two questions: how bad is the overhead of duplicate generation, and how can it be avoided? We use a restricted class of join reordering to illustrate the problem. For the first question, our analysis shows that as queries get larger, the number of duplicates is several times that of the new elements; even for small queries, duplicates are generated more often than new elements. For the second question, we describe a technique to avoid generating duplicates, based on keeping track of (a summary of) the derivation history of each element. Keywords: Query optimization, Transformation-based…
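    The history-tracking idea can be illustrated with a deliberately small pruning policy (hypothetical, and much weaker than the paper's rule-disabling scheme, which eliminates duplicates entirely): remember which rule produced an expression at its root and skip the inverse rule there, since that application can only re-derive the already-known parent.

        COMM, RASSOC, LASSOC = "comm", "rassoc", "lassoc"
        INVERSE = {COMM: COMM, RASSOC: LASSOC, LASSOC: RASSOC}

        def root_rewrites(e):
            """(rule, result) pairs for one rule application at the root of e."""
            l, r = e
            yield COMM, (r, l)
            if isinstance(l, tuple):
                yield RASSOC, (l[0], (l[1], r))   # (A join B) join C -> A join (B join C)
            if isinstance(r, tuple):
                yield LASSOC, ((l, r[0]), r[1])   # A join (B join C) -> (A join B) join C

        def rewrites(e, skip_root_rule=None):
            """All one-step rewrites of e, optionally skipping one rule at the root."""
            if not isinstance(e, tuple):
                return
            for rule, d in root_rewrites(e):
                if rule != skip_root_rule:
                    yield rule, True, d           # True: rewritten at the root
            l, r = e
            for rule, _at_root, d in rewrites(l):
                yield rule, False, (d, r)
            for rule, _at_root, d in rewrites(r):
                yield rule, False, (l, d)

        def closure(start, track_history):
            known, frontier, applications = {start}, [(start, None)], 0
            while frontier:
                nxt = []
                for e, made_by in frontier:       # made_by: rule that produced e at its root
                    skip = INVERSE[made_by] if (track_history and made_by) else None
                    for rule, at_root, d in rewrites(e, skip_root_rule=skip):
                        applications += 1
                        if d not in known:
                            known.add(d)
                            nxt.append((d, rule if at_root else None))
                frontier = nxt
            return known, applications

        start = ((("A", "B"), "C"), "D")          # left-deep initial tree over 4 relations
        naive, naive_apps = closure(start, track_history=False)
        pruned, pruned_apps = closure(start, track_history=True)
        assert naive == pruned                    # the pruning loses no alternatives
        print(f"rule applications: {naive_apps} naive vs {pruned_apps} with history tracking")

    The paper's actual contribution is a stronger bookkeeping scheme in which the rules enabled for each derived element are restricted so that no duplicate is ever generated; the sketch above only shows the flavour of carrying a summary of the derivation history with each element.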

    Counting, enumerating, and sampling of execution plans in a cost-based query optimizer

    No full text
    Testing an SQL database system by running large sets of deterministic or stochastic SQL statements is common practice in commercial database development. However, code defects often remain undetected because the query optimizer's choice of an execution plan is not determined by the query alone but is strongly influenced by a large number of parameters describing the database and the hardware environment. Modifying these parameters in order to steer the optimizer to select other plans is difficult, since it means anticipating the often complex search strategies implemented in the optimizer. In this paper we devise algorithms for counting, exhaustive generation, and uniform sampling of plans from the complete search space. Our techniques allow extensive validation of both the generation of alternatives and the execution algorithms with plans other than the optimized one: if two candidate plans fail to produce the same results, then either the optimizer considered an invalid plan, or the execution code is faulty. When the space of alternatives becomes too large for exhaustive testing, which can occur even with a handful of joins, uniform random sampling provides a mechanism for unbiased testing. The technique is implemented in Microsoft's SQL Server, where it is an integral part of the validation and testing process.
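    The counting and unbiased-sampling idea can be sketched over a toy, hand-built memo. The group names, operators, and alternatives below are hypothetical; the point is the two recurrences in the spirit of the abstract: the plan count of a group is the sum, over its alternatives, of the product of the input-group counts, and sampling picks each alternative with probability proportional to the number of complete plans beneath it.

        import random
        from functools import lru_cache

        # group id -> alternatives; an alternative is (operator, input group ids)
        MEMO = {
            "A":   [("Scan A", ()), ("IndexScan A", ())],
            "B":   [("Scan B", ())],
            "C":   [("Scan C", ())],
            "AB":  [("HashJoin", ("A", "B")), ("LoopJoin", ("A", "B")), ("LoopJoin", ("B", "A"))],
            "ABC": [("HashJoin", ("AB", "C")), ("HashJoin", ("C", "AB")), ("MergeJoin", ("AB", "C"))],
        }

        @lru_cache(maxsize=None)
        def count(group):
            """Number of distinct complete plans rooted in this group."""
            total = 0
            for _op, inputs in MEMO[group]:
                plans = 1
                for g in inputs:
                    plans *= count(g)
                total += plans
            return total

        def sample(group):
            """Draw one complete plan uniformly at random from the group's plans."""
            alts = MEMO[group]
            # Weight each alternative by the number of complete plans it roots,
            # then recurse independently into each input group.
            weights = []
            for _op, inputs in alts:
                w = 1
                for g in inputs:
                    w *= count(g)
                weights.append(w)
            op, inputs = random.choices(alts, weights=weights, k=1)[0]
            return (op, tuple(sample(g) for g in inputs))

        print("plans rooted in group ABC:", count("ABC"))   # 18 in this toy memo
        print("one uniformly sampled plan:", sample("ABC"))

    Because each alternative is chosen with probability proportional to the plans below it, every complete plan is drawn with probability 1/count(root group), which is the unbiased behaviour the abstract relies on when exhaustive testing becomes infeasible.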